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A stochastic density matrix approach to approximation of probability 
distributions and its application to nonlinear systems* 

Igor G. Vladimirov^ 

Abstract 

This paper outlines an approach to the approximation of probability density functions by quadratic forms of weighted 
orthonormal basis functions with positive semi-definite Hermitian matrices of unit trace. Such matrices are called stochastic 
density matrices in order to reflect an analogy with the quantum mechanical density matrices. The SDM approximation of a PDF 
satisfies the normalization condition and is nonnegative everywhere in contrast to the truncated Gram-Charlier and Edgeworth 
expansions. For bases with an algebraic structure, such as the Hermite polynomial and Fourier bases, the SDM approximation 
can be chosen so as to satisfy given moment specifications and can be optimized using a quadratic proximity criterion. We apply 
the SDM approach to the Fokker-Planck-Kolmogorov PDF dynamics of Markov diffusion processes governed by nonlinear 
stochastic differential equations. This leads to an ordinary differential equation for the SDM dynamics of the approximating 
PDF. As an example, we consider the Smoluchowski SDE on a multidimensional torus. 

I. INTRODUCTION 

Practical solvability of performance analysis and control design problems for stochastic systems often depends on tractabil- 
ity of relevant quantities, such as moments of the system variables. For example, the Kalman filtering and Linear Quadratic 
Gaussian control theories [1] substantially employ the preservation of Gaussian nature of probability distributions of the state 
variables governed by linear SDKs. Under the linear dynamics, the first and second order moments of the variables (and 
more complicated functionals of Gaussian distributions) are amenable to a complete analysis. It is the convenience of linear 
Gaussian models that makes them so popular in filtering/control with quadratic and related (for example, risk-sensitive) 
performance criteria. These advantages motivate the approximation of a nonlinear stochastic system by an effective linear 
model which underlies the stochastic linearization techniques. The latter date back to [3], [5], [12] and have recently been 
extended to quantum stochastic systems [30]. 

A different approach to computing the statistical characteristics of a nonlinear stochastic system (oriented at approximating 
probability distributions rather than system dynamics) consists, for example, in representing the probability distribution of its 
state variables as a mixture of Gaussian distributions whose parameters evolve in time. In fact, mixed Gaussian distributions 
arise as exact posterior probability distributions in the Lainiotis multimodel filter [15], where the conditional Gaussian 
distributions from partial Kalman filters are weighted by recursively updated posterior probabilities of the corresponding 
linear models conditioned on the observations. This combination of a bank of Kalman filters with a “mixing” block is a 
recursive implementation of the Bayesian approach. The important property that the resulting mixture of Gaussian PDFs is 
a legitimate PDF, which is nonnegative everywhere and satisfies the normalization condition, does not always come with 
PDF approximations, in general. For example, the truncated Gram-Charlier and Edgeworth expansions [2], based on the 
Hermite polynomials, are not equipped with this feature, although they provide control over moments or cumulants up to 
an arbitrary given order. 

The aim of the present paper is to outline an approach to the approximation of PDFs by using quadratic forms of weighted 
complex-valued orthonormal basis functions with positive semi-definite Hermitian matrices of unit trace. These matrices are 
called stochastic density matrices (SDM) in order to emphasize an analogy (and, at the same time, avoid confusion) with 
the quantum mechanical density matrices [25]. The SDM approximation leads to a legitimate PDF which is nonnegative 
everywhere and satisfies the normalization condition. Furthermore, it retains the possibility to control the moments of the 
PDF for orthonormal bases with an algebraic structure, such as the Hermite polynomial and Fourier bases. The SDM 
approximation can be optimized by using a proximity criterion for PDFs based on the second-order relative Renyi entropy 
[23], which leads to a quadratic minimization problem. 

This allows the SDM approach to be applied to PDFs of Markov diffusion processes, governed by nonlinear SDEs, by 
reducing the approximate numerical integration of the Eokker-Planck-Kolmogorov equation (EPKE) [28] to the solution of 
an ODE for the SDM, which resembles the Galerkin approximations for parabolic PDEs [7]. As an illustration, we consider a 
Smoluchowski SDE [8], [11], [31] on a multidimensional torus, which provides an example of a nonlinear stochastic system 
with rotational degrees of freedom. The SDM approach admits a real version in the case of real-valued basis functions, with 
an appropriate reformulation of the results. It is relevant to mention a connection of this approach with the methods using 
the sum of squares (SOS) of polynomials for Lyapunov stability analysis and global optimization [16], [22]. However, the 
SDM approach, proposed in the present paper, serves a different purpose here and is not restricted to polynomials. 
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The paper is organised as follows. Section |II] describes the class of PDFs generated by an SDM and a set of orthonormal 
functions. Section |III] relates the algebraic structure of the orthonormal basis to the moments of such PDFs. Section |IV] 
discusses effective parameters of the SDM which control the PDF. Section |V] specifies this class of PDFs for the multivariate 
Hermite polynomial and Fourier bases. Section |Vl] describes the SDM approximation of a given PDF using a quadratic 
criterion. Section IVIII extends the SDM approximation to PDF dynamics of Markov processes. Section IVIIII considers the 
Smoluchowski SDE on a multidimensional torus. Section HXl reformulates the corresponding FPKE in the spatial frequency 
domain. Section describes the SDM approximation of the PDF dynamics using the Fourier basis and provides numerical 
results. Section IXJ makes concluding remarks. 

II. STOCHASTIC DENSITY MATRIX 

Suppose G is a finite-dimensional state space of a dynamical system. To be specific, we assume that G is a domain in 
M", or an «-dimensionaI torus T". In what follows, we use a complex Hilbert space 

:= v) := {/: G ^ C : ^ |/(x)pvWdx < +-} (1) 

of square integrable complex-valued functions on the set G with a weight v : G —>■ (0,-|-oo). The norm jjfjjje’ '■= yj (/,/)jr 
in the space is generated by the inner product 

■= f f{x)gix)v{x)(h:, (2) 

JG 

where the integral is over the standard «-variate Lebesgue measure, and (•) is the complex conjugate. Eurthermore, we 
denote by {X,Y) := Tr(X*y) the Erobenius inner product [9] of complex matrices X and Y, which reduces to Tr(Xy) for 
complex Hermitian matrices (the real Hilbert space of such matrices of order r is denoted by Hr, with H+ the corresponding 
set of positive semi-definite matrices). Also, (•)* := ((•))^ is the complex conjugate transpose, and vectors are organised as 
columns unless specified otherwise. Suppose {(Pk)keQ. £ is a fixed but otherwise arbitrary orthonormal basis in the Hilbert 
space O, whose elements are indexed by a denumerable set H. The latter is assumed to be a subset of a multidimensional 
integer lattice which contains the origin: 0 G H. Therefore, 

= j,keQ., (3) 

where 5jk is the Kronecker delta. Now, suppose A is a given nonempty finite subset of £2 consisting of N :=#A elements. 
Also, let S := {sjk)j,keA £ be a matrix of unit trace Tr5 := Y.keA^kk = 1- Such matrices form a convex subset of the 
space Hat which we denote by 

6iv:={5GH+: Tr5=l}. (4) 

With the orthonormal basis {(Pk)keQ. in ^ the set A being fixed, we associate with S € &n a real-valued function 
p : G^M. defined as a quadratic form 

p{x):=v{x) ^ Sjk(Pj{x)(pk{x) = v{x)<Y>{x)*S<Y>{x) 
j,keA 

=v{x){SMxMxr), xGG, (5) 

where <I>: G —is a vector-valued map formed from the basis functions as 

4>(x) := {(Pk{x))keA- (6) 

Lemma 1: Eor any matrix S G &n from the set (|4]i, the corresponding function p, defined by (|5]l and (|6]l, is a PDE on the 
set G. 

Proof: The positive semi-definiteness 5 )= 0 and the positiveness v > 0 in (|5ll imply that p{x) ^ 0 for all x (H G. 
Eurthermore, the orthonormality (|3ll and the assumption Tr5 = 1 imply that the function p satisfies the normalization 
condition f(jp{x)dx = T.j^keA^jki^PjT^Pk}^ = Tr^ = 1, and hence, p is indeed a PDE. ■ 

Due to its properties and the role in the construction (|5]l of a legitimate PDE, the matrix S resembles the density matrices 
in quantum mechanics [21], [25]. In order to reflect this analogy (and, at the same time, avoid confusion) with the quantum 
mechanical density matrices and the stochastic matrices of transition probabilities of classical Markov chains [26], we will 
refer to 5 as a stochastic density matrix (SDM). Since the set 6n of SDMs 5 in (|4]i is convex and the PDE p in (|5]l depends 
linearly on S, the set of such PDEs is also convex. If the Hilbert space in ([T]i is restricted to real-valued functions, then the 
SDM becomes a real symmetric matrix, and the results that follow can be appropriately reformulated for this case. 
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III. ALGEBRAIC STRUCTURE AND MOMENTS 

Eor what follows, we assume that = T and hence, in view of Q, the weighting function v satisfies /gV(x)cb: = 1 
and is itself a PDE. This allows the inner product (|2]i to be represented as {f,g)^ = Ev(/g), where Ev(-) denotes the 
expectation over the PDE v. Also, suppose the basis functions are algebraically closed in the sense that there exist complex 
constants eju such that for any j,k G D., the representation 

(Pj{x)(Pk{x) = ^ ejucpiix) ( 7 ) 

ien 

holds for all X € G and contains only a finite number of terms with nonzero coefficients ejM- The coefficients ejkt in ©, 
which quantify the algebraic structure of the orthonormal basis {(pk)kea, can be computed as 

ejkl = {(Pe, = Ev {(PjWW) (8) 

and are symmetric with respect to the subscripts k, £. Eurthermore, ejki is real and invariant under arbitrary permutations 
of its subscripts in the real case mentioned at the end of Section |II] Eor any £ G Q, we define the ith structure matrix 

El := {ejke)j,keA £ by 

Ee-= {ejke)j^keA = E^viW^^*)j (9) 

where the expectation applies entry wise, and the map <I> is given by (|6]l. In particular, since (pQ= 1, then ([8]l implies that 
ejko = Sjk for all j,k G A, and hence, Eq = Lv is the identity matrix of order N. Since the left-hand side of (|7]) is the ( 7 ,A:)th 
entry of the matrix <l>(x)'l>(x)*, the algebraic closedness property is representable in a vector-matrix form: 

<I>(x)<I>(x)* = 52 (10) 

teU 

This series contains only a finite number of terms since the matrix Eg in (|9]l vanishes for all but finitely many indices i gD. 
which form the set 

l3:={iGQ.:Eg^0}. (11) 

This set, consisting of L := #15 elements, depends on the set A and contains 0 (since Eq = Ifg). Substitution of (fTOl i into the 

definition of the PDE p in (|5]l leads to 

p{x) = v(x) ^ {S,Ee)(pe{x) 
ieU 

= v(x) £ (Eg,s}'^ = v(x)C(5)*'P(x). (12) 

eeu 

Here, C: is a linear vector-valued map defined in terms of the nonzero structure matrices by 

C(S) := ({Eg,S})geU, (13) 

and, similarly to (|6]l, the map ‘P : G —is formed from the basis functions as 

‘P(x) := i(pe{x))geu- (14) 

The second equality in (fT2]) follows from p and v being real-valued. In view of this equality and Q, the expectations of 
the basis functions over the PDE p (which play the role of generalized moments of this PDE) are computed as 

EipiPm •— I pi^)tpmi^)^ — = (Effi^S) (15) 

Jg leu 

for any m G and vanish for all m G Q.\13 in view of (fTTI) . Therefore, (fTSl i describes the series expansion of the PDF p 
from 0 over the orthonormal basis in Jif, and the map C in ( fOl l encodes the dependence of all the nonzero moments in 
on the SDM S. The fact that the moments depend linearly on S allows the SDM to be chosen so as to satisfy given 
moment specifications for the corresponding PDF p in Q . Note that the moments Ep(p„,, with m ^ 0, are responsible for 
the deviation of the PDF p from the weighting function v. In particular, (fT2l i implies that the second-order relative Renyi 
entropy [23] of p with respect to v (with the latter playing the role of a reference PDF) can be expressed in terms of the 
moments in dH as 

^ 

£,m^'i3 

= ln(l+ Y. l(E^5)p). (16) 

fGO\{0} 

Here, the orthonormality condition Q is used in combination with the property that {Eo,S) = {In,S) = Tr5 = 1. 
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IV. EFFECTIVE PARAMETERS 

Note that the PDF p, defined by Q, can also be represented in the form 

pW = L (17) 

76 A 

which resembles the SOS structures [16], [22]. Here, Oj denote the eigenvalues of the SDM S (they are all real and 
nonnegative and satisfy L;gA ^7 = 1)> the functions 0^ : G —?> C are obtained by a unitary transformation of the functions 
(Pk as 

W •= L W, ; S A, X G G. (18) 

keA 

Also, U := {ujk)j,keA € is a unitary matrix whose columns are the eigenvectors of S, and hence, S = VLU*, where 

E ;= diagyg^((J,) is a diagonal matrix formed from the eigenvalues of S. The functions 9j, defined by (fTSI l. are also 
orthonormal elements of the Hilbert space in ([T]i, that is, {9j,9k),^ = Sjk- The representation (117b of the PDF p as a 
convex combination of the squares of the functions y^\9j\ involves not only the freedom of varying the coefficients Gj, 
but also the possibility of “mixing” the N elements (jOj. of the given orthonormal basis of in an arbitrary unitary fashion 
( fTSl l. Since the SDM S is Hermitian and has unit trace, then the total number of real parameters which enter the PDF p 
through S is dimH^r— 1 = — 1. However, not all of these parameters are active, in general. From the decomposition 

S = n(5) + n^(5) of the matrix S into the orthogonal projections n(5) and n^(5) :=5--n(5') onto the linear subspace 

^;=span{£^: fGG}, (19) 

spanned by the nonzero structure matrices in (|9|l, and its orthogonal complement respectively, it follows that n^(5) does 
not influence the PDF p in ([T2t . Here, G is a finite subset of Q. given by (ITTl i. Hence, the number of effective parameters 
of p does not exceed the dimension dim is of the subspace dT^ . that is, the rank of an appropriate Gram matrix: 

dim^ = mnk{{Ei,E,n)){^,neU- (20) 

The following lemma provides a condition for the nonsingularity of the Gram matrix, that is, the linear independence of the 
structure matrices Eg, with f G G, in the Hilbert space To this end, for any finite subset M C fi, we denote by 

■= spsin{(pi^: k£M} (21) 

the subspace of spanned by the corresponding set of basis functions. Also, let be a set of functions on G associated 
with the subspace as 

^A-={fg- (22) 

From the algebraic closedness (|7]i of the basis functions, it follows that 

J^ACJ^h. (23) 

Indeed, in view of the notation dZB, any two functions f,gG J^a are representable as / = «*<!> and g = b*<i> for some 
vectors G C^, where <I> is given by (l6]i. Therefore, dTOb implies that fg = = T.eeu^*^eb(pe G whence the 

inclusion ( 12^ follows in view of the arbitrariness of the functions /,g G J^a- 

Lemma 2: The linear subspace S’, defined by ( [T9] l. has full dimension dim^ = L if and only if the orthogonal complement 
of the set ^a, given by (i22l i. to the subspace consists of the zero function: 

^An^o = {0}. (24) 

Proof: As mentioned above in regard to (i20l i. the property dimS = L is equivalent to the linear independence of 
the structure matrices Eg, with £ G 13, in the space Now, a linear combination T^geU'^iEi of these matrices with 

complex coefficients q vanishes if and only if so does the quantity a*Y.gfzjjCeEgb ((^a*<ixp*b) =Ev {fgh) for 

all a,bG Here, / := a*^ G JSa, g '■= G JSa, and h := are auxiliary functions. Hence, the matrices 

Eg, with i G 13, are linearly dependent if and only if there exists a function h G J£^\{0} such that Ev{fg!i) = 0 for all 
f,gG GSSa, that is, h G This establishes the equivalence between the linear independence of these matrices and the 
condition (i24l i. ■ 
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V. HERMITE POLYNOMIAL AND EOURIER BASES 
Eor completeness, we will now specify the PDEs (|5]l for two classical Hilbert spaces of functions. Assuming that G := 

Hk{x) 


consider the functions 


(pk(x) := 




X • — {xj 


( 25 ) 


which are labeled by u-indices k := {kj)i<^j<^„ G =: Q. (with Z+ the set of nonnegative integers) and are obtained from 
the «-variate Hermite polynomials 

Hkix) := (-l)We2W'a^ViW'. (26) 

Here, use is made of the standard multiindex notation k\ := ki \ x ... x k„], \k\ := ki +... + kn, and := dx} ... 5*'', so that 
Hk = Hki ® . ■ ■®Hk„ is the tensor product of the corresponding univariate Hermite polynomials. The functions (l25l) form an 
orthonormal basis in the real Hilbert space := v) of real-valued square integrable functions, where the weight 

V is the «-variate standard normal PDE 


v(x) := 


X G . 


(27) 


(2;r)"/2 ’ 

The orthonormal basis {(Pk)k€i,i, given by (l25l l and (l26l l. satisfies a real-valued version of (|2]l. In view of [10, Eq. (3.13) on 
p. 28] and [19, Theorem 3.2.1 on p. 13], the structure coefficients (O are computed as 


ejke = 




j k-\- i 


(28) 


(m ——k)[{m —£)[’ 2 ’ 

for all «-indices j,k,i G Z']_ whose entries do not exceed the corresponding entries of m and such that j + k + i consists 
of even integers, with ejkg = 0 otherwise. In this case, the PDE p in (|5]l is specified by a real symmetric SDM S, and its 
representation (1121) in terms of the structure matrices from (|9l) takes the form 


p{x) = 




Sjk- 


Hj{x)Hk{x) 




e^2 




(29) 


in view of (i27l) . Here, A is a finite subset of Z'j_, and, in view of (i28] l. the corresponding set 15 in (ITTli is the Minkowski 
sum of the set A with itself: 15 = A +A := {j + k: j,k G A}. Although (12^ is organised as a truncated Hermite polynomial 
expansion, its coefficients {S,E() are parameterized by the SDM 5 in a specific way, which, according to Lemma[Tl guarantees 
that p is a legitimate PDE. As another example, consider the Eourier basis on the u-dimensional torus T" (where T is realised 
as a half-open interval [0,2;r)) consisting of the functions 

(Pkix):=e‘’‘^^, xGT,kGZr (30) 


These functions are indexed using the «-dimensional integer lattice Z" and form an orthonormal basis in the complex Hilbert 
space := (27r)^") with a constant weight (27r)^". The latter is the PDE of the uniform distribution over the torus. 

Since the functions (i30l l satisfy (PjT^ = (Pj-k, the algebraic structure coefficients in (|7|i are given by 


ejki = Sj-k,e, j,k,iGZ". 


In this case, in view of ([T2l i. the PDE p in (|5]) takes the form of a trigonometric polynomial 


p(x) 


1 

( 2 ;r)" 


j,keA 


1 

(2a:)" 


eeij 


(31) 

(32) 


where the set 15 = A — A := {7 — k : j,k G A} (which is the Minkowski difference of the set A with itself) is symmetric 
about the origin in Z". Again, Lemma [T] ensures that (l32l i describes a legitimate PDE on the torus for any SDM S. In 
the multivariate case being considered, the trigonometric polynomial p in (l32] | is not necessarily reducible to the squared 
absolute value of a single trigonometric polynomial. The Eejer-Riesz theorem [24] guarantees such a spectral factorization 
of a nonnegative trigonometric polynomial only in the univariate case. Erom the representation (ITtI i. with v = it 

follows that (I 32 I) is a mixture of N such factorizations: 


p{x) = {2%)-''Y^aj\dj{x)\\ (33) 

where, in accordance with (ITSl l. 67 (x) := are trigonometric polynomials. The PDE p in ([33]) reduces to a single 

Eejer-Riesz spectral factorization in the case of a rank-one SDM S = uu*, where u G is a unit complex vector. Recalling 
the analogy with the quantum mechanical density matrices mentioned above, such SDMs correspond to pure quantum states 
[25] which are extreme points of the convex set of mixed states. Therefore, this multivariate Eourier basis example shows 
that the SDM approach is, in principle, able to produce a wider class of PDEs than those obtained by squaring a single 
linear combination of elementary functions. 
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VI. QUADRATICALLY OPTIMAL SDM APPROXIMATION OF PDFS 
We will now consider a problem of approximating a given PDF f : G ^ K+ by the PDF p from (|5]). More precisely, 
assuming that the orthonormal basis in Jff and the set A are fixed, the SDM S is varied over the set (|4]i so as to minimize 
a discrepancy between the actual PDF / and the approximating PDF ps : = p (which is parameterized by S): 

D{f,Ps) —^ min, S € &n- (34) 

For example, one of such proximity criteria is described by a quadratic functional 

2 1 r {m-p{x)f 

2Jg v ( x ) 

where use is made of the Renyi entropy from (fTbl l. provided R(/||v) and R(/7||v) are finite, and the ^ factor is introduced 
for further convenience. A similar, though different, “mean integrated squared error” criterion is employed in the kernel 
density estimation; see, for example, [4], [6] and the references therein. In view of the linear dependence of ps on S, the 
quantity D{f,ps), given by (iTSl l. is a convex (but not necessarily strictly convex) quadratic function of S, which makes 
(O a convex quadratic minimization problem over the convex set of SDMs &n. We will therefore consider a “regularised” 
version of this problem: 

D{f,ps)—IJ.lndetS —min, Sg&n, (36) 

where /r > 0 is an additional parameter which weights the strictly convex function — lndet5, with the latter also playing the 
role of a barrier function for the SDM S to avoid singularity and hence, to remain positive definite. The resulting strictly 
convex minimization problem (l36] l has a unique solution which is described below. The following theorem employs a positive 
semi-definite self-adjoint operator £/, acting on the Hilbert space Hat and completely specified by the structure matrices (|9]l 
as 

£/{X) Y. {Ei,X)Ei, X e H^. (37) 

(eU 

We will also need a linear operator ^ which maps a PDF f: IR+ to a complex Hermitian matrix 

^if) := Y, EfCpiEe, (38) 

eeu 

with the latter being related to the generalized moments of /: 

Ef(pe = jj(pedx=(^^,(Pi)^. (39) 

The finiteness of these moments is guaranteed by the assumption R(/||v) < -foo in view of the orthonormality Q and the 
Cauchy-Bunyakovsky-Schwarz inequality. 

Theorem 1: Suppose f : G ^ R+ is a given PDF with finite Renyi entropy in ( fTbl i: R(/|| v) < -foo. Then for any given 
value of the barrier parameter ji >0, the optimization problem (l36] l. with the proximity criterion dTSl l. has a unique solution 
8)^0 which satisfies 

£/{S)-pS-^ =XlN + ^if). (40) 

Here, A G R is found from the normalization condition TrS' = 1, and the operators £/ and are defined by (|J7]) -(l39b. 

Proof: Application of the method of Lagrange multipliers to the strictly convex minimization problem (l36l) yields the 
following condition of optimality: 

- At lndet5 - A(Tr5 - 1 )) 

= dsD{f,ps)-XlN-piS-^=0. ( 41 ) 

Here, A G R is the Lagrange multiplier associated with the trace constraint in (IHi, and 

dsD{f,ps)= [ ips-m<P*dx (42) 

Jg 

is the Frechet derivative of D{f,ps) in dASl) as a composite function of the SDM S, where use is made of the pointwise 
Frechet derivative dsps = vOO* of the PDF ps in (|5]). By combining (fTOl i with the moments of the PDF ps in O, it 
follows that 


[ ps^^*dx = y EpCpiEi = 

(43) 

Jg 


•2G ieu 

(44) 


D{f,p) :=2 


J-P 
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Here, the operator defined by dTTl i. is the second-order Frechet derivative d^D{f,ps), and use is made of ( |38] |. ( |39] ). 
Substitution of (I42l i- (l44l i into djll i leads to (l40l i. The uniqueness of the pair (A,S'), satisfying (l40l i together with S 0 
and TrS = 1, is ensured by strict monotonicity of TrS and —with respect to S in the sense of the partial 

ordering on H^v induced by positive semi-definiteness. Indeed, (03) implies that the operator £/ is nondecreasing since 
£/(^X) = /g v(X,<I><I>*)T>T>*ck )>= 0 for any X G H^, while the map S is strictly increasing [9]. ■ 

The barrier parameter /r > 0 in (l36l l controls sensitivity of the optimal SDM S to the PDF / being approximated. In 
particular, lim^^+ooS = jjIn- At the other extreme, for small values of p, the SDM S can become nearly singular and highly 
sensitive to /. 


VII. SDM APPROXIMATION OF PDF DYNAMICS FOR MARKOV DIFFUSION PROCESSES 

The PDE ps in (|5ll, with the SDM S computed according to Theorem [T] can be used as a legitimate approximation 
for PDEs of random processes. More precisely, suppose /(f,-) : G —?> M+ is the time-varying PDE of a G-valued Markov 
diffusion process ^ (f) with an infinitesimal generator (so that = Ef(di(p + '!^{<p)) for any smooth test function 

(p : R+ X G —?► C; see, for example, [11]). Eor what follows, it is assumed that the Renyi entropy R(/||v) in (ITfil) remains 
finite. This can be studied by using the integro-differential relations 

d.e^ifW'’) = d,Ef^ = Ef(^^ (45) 

which follow (provided / and V are smooth enough) from the PDE dynamics governed by the EPKE 

o>r/= #(/), (46) 

where the adjoint is understood in the sense of the standard .5f^-space with the unit (or constant) weight. Then the 
SDM S of the corresponding approximation ps of / acquires dependence on time, which can be represented in the form 
of an ODE. The theorem below employs a self-adjoint operator Fg on the space Ha?, which is the Erechet derivative of the 
function of S on the left-hand side of (l40l) : 

Fs{X)=£/{X)+pS-^XS-\ XgHa,. (47) 


Due to the assumption that p > 0, the operator Fs is positive definite and strictly increasing for any given 8)^0. These 
properties of Fs are inherited by its inverse F^^. Also, let X be a linear operator which maps the PDE / to the complex 
Hermitian matrix 


K{f) := 52 

ten 


(48) 


where is the generator of the Markov diffusion process and use is made of the structure matrices (|9l). 

Theorem 2: Suppose the PDE of the underlying Markov diffusion process has finite Renyi entropy R(/||v) in (fTfil l 
at every moment of time f ^ 0. Then the SDM S, associated with / according to Theorem [T1 satisfies the ODE 


S = F,\K{f)) 


TrF-‘(X(/)) 

TriV'(/^) 


Fs\In). 


where the operators Fs and K are defined by (03 and (03- 

Proof: By taking the time derivative on both sides of (03, it follows that 


Fs{S) = IIn + = llN + K{f), 


(49) 


(50) 


where use is made of (03 and a combination of (l38T l with (03- The invertibility of the operator Fs allows dSOl l to be solved 
for S as 

S = XFs\lN)+Fs\K{f)). (51) 

Since Fs is strictly increasing, then Ff^il^) >- 0. Hence, TrFf^{If^) > 0, and X in (fSTT i can be uniquely found so as to 
satisfy the condition Tr5 = 0 (which comes from the preservation of Tr5 = 1 in time): 


TrF-‘(X(/)) 

TrFf^il^) 


(52) 


By substituting (153 back into (ISTT i. it follows that the SDM S is governed by (09b . ■ 

The right-hand side of (03 is linear with respect to the PDE / and depends on 5 in a rational fashion. The latter follows 
from the representation 

vcc{Ff^{X)) = (^wec{Ei)'vec{Ei)* + pS^^ vec(X) (53) 

^ten 
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obtained by applying the vectorization vec(-) of matrices [18] to ( ITtI ) and WT\ and using the relation = S in view of S 
being Hermitian, with 0 the Kronecker product. If / is an invariant PDF of the Markov process ^, then (l48l l implies that 
K{f) = 0, and (l49T l yields 5 = 0 in accordance with the static setting considered in Section[Vl] Now, for nonlinear stochastic 
systems, the solution / of the FPKE (I46I) is usually not available in a closed form. In this case (when approximations of 
/ are particularly important), the right-hand side of (09]) can be evaluated by replacing the unknown PDF / with its SDM 
approximation ps, which leads to 

S = (54) 

Ir/^^ w) 

Here, 2 is a linear operator on the space Ha^, obtained by substituting the PDF ps from (ITSli into the operator K in (l48T i: 

Q{S):=K{ps)= Y,^Ps'^i(P>n)Em 

= ^ (55) 

e,meU 

The right-hand side of the ODE (l54l) is a rational function of the entries of S (thus resembling the Riccati equations), which 
can be represented in the vectorised form by combining (l5^ with 

vec(g(5))= Y, ( 9 ’£,^( 9 m))jr''ec(£'m)vec(£'£)*vec( 5 ), (56) 

following from ( 1551 ). The constant matrices in (1531 ) and ( l56l l can be precomputed using the structure matrices (|9]) and the 
matrix elements of the generator over the basis. By construction, the ODE ( |54] | has a positive definite solution whose 
trace is preserved in time. Therefore, being a closure of ( 1491 ), the SDM dynamics produce a legitimate approximate solution 
PS of the EPKE (06]). 


VIII. SMOLUCHOWSKI SDE ON MULTIDIMENSIONAL TORUS 

We will now consider an application of the SDM approach to the approximation of PDEs for random processes governed 
by nonlinear SDEs. As an illustrative example, we will use a version of the Smoluchowski SDE [ 8 ], [11], [31] on the 
n-dimensional torus T": 

d^{t) = -VV[^[t))At + adW[t). (57) 


Here, ^ := {t))t^o is a T"-valued Markov diffusion process, and V : T" —)■ R is a twice continuously differentiable function 

with the gradient W := {dx,^V)i^k^„ : T" —R", where dx^ (•) is the partial derivative with respect to the kth angular coordinate. 
Also, <7 > 0 is a scalar parameter, and W is an n-dimensional standard Wiener process. The state space in the form of a 
multidimensional torus corresponds to systems with rotational degrees of freedom. The SDE 07]) (which is understood in 
the Ito sense) is a noisy version of the gradient descent for the function V and is employed, for example, in the simulated 
annealing algorithm of stochastic optimization [13]. The function V can be interpreted as the potential energy of a dissipative 
dynamical system in contact with a heat bath at temperature T >0 which specifies the noise level a in 07]) as <7 = s/2k^T, 
with the Boltzmann constant. This interpretation is motivated by the fact [11] that the invariant probability measure for 
the Smoluchowski SDE 07]) is absolutely continuous with the Gibbs-Boltzmann PDF 




xeT, p ;= 



(58) 


Here, Z(j3) := fj„ is the statistical mechanical partition function [20] of the auxiliary parameter j3 = associated 

with the potential V. The invariant PDF /* in (08]) is a unique equilibrium point of the FPKE (06]) for the PDF /(f, •): T" —R+ 
of the random vector ^(f), which takes the form 


a,/ = div(/vy) + ^A/. 


(59) 


Here, div( ) and A( ) are the divergence and Laplace operators over the spatial variables. The corresponding generator ^ 
acts on a smooth test function 9 : T” —C as 


^{(p) =-'VV^'V(p + ^A(p. (60) 

Although the invariant PDF /* admits a closed-form representation (081 ). the time evolution (091 ) of / towards the equilibrium 
can be complicated, and it is especially so for multiextremum potentials V. For example, in the context of molecular dynamics 
simulation, such V can represent fractal-like energy landscapes (as functions of the dihedral angles in protein macromolecules) 
which are considered to play an important role in relaxation phenomena associated with protein folding [17]. This motivates 
investigation of the PDF dynamics for such FPKEs. 





IX. FPKE IN SPATIAL FREQUENCY DOMAIN 

We will now reformulate the FPKE (|59] | in the spatial frequency domain using the fact that all functions on the torus 
T" are 27r-periodic with respect to the angular variables and (under the assumption of square integrability underlying the 
Hilbert space := (27r)^")) can be represented by Fourier series. Let 

y{x)= Y.^k(Pkix), f{t,x)= fk{t)(pk{x) (61) 

ytGZ" keV 

be the Fourier series for the potential V and the PDF /(f,-) over the corresponding basis ( l30b . Since the functions V and / 
are real-valued, their Fourier coefficients satisfy the Hermitian property VC*. = I 4 f_k = fk for all k and any time 
t. The normalization condition for the PDF /(f,-) is equivalent to 

Mt) = (271)-’'[ f{t,x)dx={27t)-", (62) 

Jj" 

Since the potential V is assumed to be twice continuously differentiable, ||AU||oo := maXjozjn |AU(x)| < + 0 °. Furthermore, if 
V is inhnitely differentiable, it can be shown that the FPKE ( |59] ) has a smooth fundamental solution [28]. Similarly to the 
energy estimates for parabolic PDFs [7], the norm ||/(f,-)l|j^ any time f > 0 is amenable to an upper bound in terms 
of 11/(0,-) 11^. This is obtained by applying the Gronwall-Bellman lemma to the following differential inequality which is 
given here for completeness. 

Lemma 3: The PDF /, governed by the FPKE ( |59] | for the Smoluchowski SDE (|57] | with an infinitely differentiable 
potential V, satisfies 

(ll/ll^)' ^ (l|Ay||oo-(T2)||/||^ + (2;r)-2"(j2. (63) 

Proof: A combination of ( |59] |. (l6^ with the identities fW^Vf = 5 (div(/^VU) — f^AV) and /A/ = jA{f^) — |V/p 
(and the property that / and V are real-valued) leads to 

(ll/ll^)‘ = 2 = 2 

= {AV,f)^-a^\\Vf\\l^ 

<l|AU||oc||/||^-(T2||g||^. (64) 

Here, the first two equalities are, in fact, a particular case of (l45l l since the Hilbert space .L'iP being considered employs a 
constant weight. Also, use is made of the Poincare-Wirtinger inequality 

||V/||^ = (27r)-"/ |V/|2ck= Y 

/tGZ"\{0} 

^\\g\\le=\\f\\lt-fl (65) 

applied to the function g\= f — fo which satisfies gck = 0 in view of (l62l i. Substitution of the last equality from dhSl) 
into (l64l i leads to (l6^ . ■ 

Another upper bound for the Renyi entropy R(/||v) is provided by the inequalities 

pR(/l|v) 

^ sRifWL) ^ 

(27r)"maX;,eT«/*(x) ^ 

eR{/(0Allv) 

(2;r)”min;ceT«/*(•«) ’ 

where both denominators are finite and strictly positive due to (fSST i and the continuity of V over the torus. The second of 
these inequalities follows from the dissipation relation 

= 2Ef^{h) = 2E/,(/j^(/j)) 

= f 

= -(j2E^.(|V/r|2)<0, (66) 

whereby R(/(f,•)||/*) is a nonincreasing function of time f ^ 0. Here, h := is the PDF of the Markov process ^ with 
respect to the invariant measure (so that the first equality in (l66T l is similar to (l431l). and use is made of a change of measure 
together with the relations = 0 and 2h^{h) = — a^\Vh'^, with the latter following from d^ . Now, dMI l allows 

the matrix elements of the generator over the Fourier basis (l30l) to be computed as 

(j2 

= Vi^,n{l-m)^m- —5hn\m\^ (67) 
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for all £,m G Z". Hence, in addition to ( |62] ). the Fourier coefficients of the PDF / satisfy a denumerable set of ODEs 

fj = {(Pj,^\f))^= Y, {'!^{(pj),(pk)j^fk 
keZ" 

= -f E W;-i - V \j?fh j e Z" \ {0}, (68) 

kew ^ 

which represent the FPKE (l59l l in the spatial frequency domain. Here, the convolution sum comes from the term /VV and 
is responsible for the coupling of these ODEs (the trivial case of a constant potential V is not considered). The truncation of 
the set of ODEs (l68l l (for example, in accordance with Galerkin’s method for parabolic PDEs [7]) leads to an approximate 
system of equations whose solution is not necessarily nonnegative and does not correspond to a legitimate PDE on the torus. 
This issue can be overcome by using the SDM approximation of PDE dynamics described in Section IVTIl 

X. SDM APPROXIMATION EOR SMOLUCHOWSKI SDE 
In application to the Smoluchowski SDE (ISTT i being considered, the SDM dynamics (l54l i can be implemented in the 
vectorised form ^ 

vec(5)- = vec(f7'(e(5))) - vec(F7^(/^)) (69) 

using (l53l l. (l56l l together with the structure coefficients OTl i and the matrix elements over the Eourier basis dinil. The 
corresponding SDM approximation (l32l i of the PDE / in (l59l l is 

ps{x) = (2;r)-"C(5)*‘P(x), (70) 

where the maps C in (fTsT i and 'P : T" —> in (fl4l i are given by 

C(5)=( E (71) 

^ j,keJk:j-k=t 

Here, N = (2r+ 1)" and L = (4r+ 1)" in the case when the set A and the associated set 13 in (fTTT i are the discrete cubes 

A:=i[-r,r]f)Zr, O = ([-2r,2r] (72) 

with r a positive integer. We will now provide results of a numerical experiment on the SDM approximation of PDEs 
for the Smoluchowski SDE (l57l i with O’ = 1 in the two-dimensional case n = 2. The potential V was generated as a 
trigonometric polynomial V(x) := ^ = 'LkeZ'^:\k\^R\^k\cos{k^x + (pk) of .x G with R = 5 (that is, 41 

independent harmonics) and exponentially distributed random amplitudes \Vk\ = |V-i:| and initial phases (pk : = argV). = —<p-k^ 
with the latter being uniformly distributed over the interval [0,27r) for any k ^ 0; see Eig. [T] The SDM approximation (iTOl) 



Fig. 1. A multiextremum potential V generated as a trigonometric polynomial of the spatial variables 0 ^ xi,JC 2 < 2 k with random coefficients. 


was produced using (l6^ and (ItTI i with the sets (f72l) with r = 2,N= (2r-|- 1)^ = 25, L = (4r+ 1)^ = 81 and /r = 0.01. The 
initial conditions for the actual PDE / and the SDM S were /(O,-) = ^ and 5(0) = with the latter corresponding to 
the same uniform distribution over the torus with the PDE p = The actual PDE / was obtained through the standard 
finite-difference numerical solution of the EPKE on a mesh of 10^ points with time step 0.002. The evolution of the relative 
values of the quadratic proximity criterion (l35T l (with D(/,0) being the squared .if^-norm of /) is presented in Eig. |2] 
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Fig. 2. The relative error of the SDM approximation in the course of evolution of the actual PDF /. 


which shows that the relative error remained within 1.4%. That the time interval captures the essential part of the movement 
of the system towards equilibrium is seen from Fig. [2 whereby both the actual PDF / and its SDM approximation ps at 
the end of simulation were close to the invariant PDF /*. 



Fig. 3. The invariant PDF /* (semitranslucent surface), the actual PDF / (black wireframe) and its SDM approximation ps (red wireframe) after 2000 
steps of evolution. 


XI. CONCLUSION 

We have outlined an approach to the approximation of PDFs by quadratic forms of weighted orthonormal basis functions 
parameterised by SDMs. The SDM approximation produces a legitimate PDF which satisfies the normalization condition and 
is nonnegative everywhere. For orthonormal bases with an algebraic structure, we have provided an optimization procedure 
for the SDM approximation using a quadratic criterion. The optimal SDM approximation has been extended to PDFs of 
Markov diffusion processes, and an ODE has been obtained for the SDM dynamics which yields a legitimate approximate 
solution of the FPKE. This has been demonstrated for the Smoluchowski SDE with a multiextremum potential on a torus. The 
SDM approach is also applicable (in the spirit of projective filtering [29]) to the approximation of posterior PDEs governed 
by the Kushner-Stratonovich equations [14]. In this application, the conditional SDM would play a part similar to that of the 
covariance matrix computed in the Kalman filters. However, unlike the covariance matrices, the SDM is not restricted to the 
second moments and involves higher order moments of the system variables. Numerical integration of the SDM dynamics 
can be implemented in a square-root form using the smoothness of the Cholesky factorization of positive definite matrices 
[27]. Error analysis of the SDM approximation is another line of research to be tackled in future publications. 
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